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ABSTRACT 



The Simplex algorithm, developed by George B. Dantzig in 1947 represents a 
quantum leap in the ability of applied scientists to solve complicated linear optimiza- 
tion problems. Subsequently, its utility in solving finite models, including applications 
in transportation, production planning, and scheduling, have made the algorithm an 
indispensible tool to many operations researchers. 

This thesis is primarily an exploration of the simplex algorithm, and a dis- 
cussion of the utility of the algorithm in unconventional optimization problems. The 
mathematical theory upon which the algorithm is based and a general description 
of the algorithm are presented. The reader is assumed to have little exposure to 
convexity, duality, or the Simplex algorithm itself. More important to the thesis are 
the examples that accompany the discussion of the Simplex algorithm. Herein are a 
variety of unusual applications for the algorithm, including applications in infinite di- 
mensional vector spaces, uniform approximation, and computer assisted tomographic 
image reconstruction. These examples serve both to facilitate a better understanding 
of the algorithm, and to present it in unusual settings. 
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I. 



OVERVIEW OF THESIS 



A. THESIS OBJECTIVES 

The development of the Simplex algorithm by Dantzig in middle of this century 
represents a milestone in linear optimization techniques. The impact of Dantzig's 
work is profound. Results of his work include the revival or introduction of a number 
of mathematical disciplines, including convexity and duality theories. Applications 
for the Simplex algorithm, and the accompanying refinements, are vast, and many 
continue to explore new and diverse applications. 

The majority of the research on linear optimization problems is taking place 
in various fields of Operations Research. Of course, the Simplex algorithm itself 
is particularly well suited to problems in that particular discipline, rendering rapid 
solutions to production planning models, transportation problems, and a variety of 
other “real world” applications. A great deal of work was done up to the early 
1970’s in attempts to mold the Simplex algorithm into an engineering and theoretical 
mathematical tool. With the advent of more sophisticated computer hardware and 
software, there may be utility in reconsidering the role of the Simplex algorithm in 
control, approximation, and other infinite dimensional applications. 

This document is intended to serve two main purposes. First, the thesis is 
intended to serve as an introduction to linear optimization and to the Simplex algo- 
rithm, or a theoretical review for readers already familiar with these topics. Second, 
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it is intended to present less traditional problems in a manner that is suitable for 
solution with the Simplex algorithm. 

B. THESIS FORMAT 

The thesis is broken into three parts. The first part, consisting of the first 
two chapters, is devoted to describing sample problems with which the theory of the 
Simplex algorithm is illustrated. Also image reconstruction is introduced, a problem 
whose solution by the Simplex algorithm highlights the thesis. These examples are 
more fully developed in the latter sections. 

The first example is particularly unusual, as we find an orthogonal basis of 
the infinite dimensional vector space L 2 [ 0, 1]. To the author’s knowledge, this is the 
first attempt to use the Simplex algorithm in this capacity. The formulations that 
result from this problem are particularly easy to understand, and lend a great deal 
of understanding to concepts underlying the Simplex algorithm. 

The second example may be found infrequently in literature on linear opti- 
mization. We seek the best approximation to the exponential function over a closed 
interval in the uniform norm sense. That is, we formulate a uniform approximation 
problem as a linear optimization problem. The formulation is used primarily in the 
discussion of duality. 

The final example is again a novel one. We formulate the problem of computer 
assisted tomographic (CAT) image reconstruction as a linear optimization problem, 
and solve a small sample problem with the Simplex algorithm. 
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The second portion, consisting of Chapters IV and V, introduces the machinery 
behind the Simplex algorithm, culminating with a brief introduction to the algorithm 
itself. Chapter IV is an exploration of convexity, both as it pertains to sets and 
functions. The major emphasis of the chapter is on convex subsets of 7v n . Chapter V 
builds on the convexity results as they pertain to duality. Fundamental concepts of 
duality are presented in this chapter, and it concludes with a generic description of 
the algorithm. 

The thesis concludes with the formulation of the image reconstruction problem 
as a linear optimization problem in the general case. The first portion of the chapter 
is devoted to the formulation, followed by the statement of the dual problem. Finally, 
a sample problem is solved, and some analysis of the appropriateness of the Simplex 
algorithm as a solution tool for this particular problem is offered. 
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II. 



PRELIMINARIES 



A. OVERVIEW 

We devote this chapter to the preliminaries of linear optimization. We begin 
by defining three very different examples, which we develop as a means to explore 
linear optimization methods. We then define the optimization problem in general, 
and the linear optimization problem specifically. We close with a synopsis of the 
assumptions that characterize the linear optimization problem. 

B. FIRST EXAMPLES 

This thesis extensively discusses three examples. We begin by stating two of 
our three examples to which we refer throughout the thesis. Because of its complexity 
and importance to this work, the third example is treated separately. 

1. Example 1: Generation of an Orthogonal Basis for 

L 2 [ 0, 1] 

Our first example is one of importance in many areas of approximation. 
We wish to find some orthogonal basis for an infinite dimensional vector space. The 
utility of such bases may be found in any elementary linear algebra or applied math- 
ematics text. The interested reader is referred to [Ref. 1]. The specific vector space 



4 



with which we are concerned is the space of functions defined by 



£ 2 [°> 1] = if : ll/ll = f(x) 2 dx^j < oc 



We note that the above norm is induced by the inner product, 



(f-g) = [ f(x)g(x) dx. 

Jo 



That is, 

imi = 

In particular, we seek an orthogonal polynomial basis, and derive an 
optimization technique to find a polynomial p n , of order n, when we are given a 
set of orthogonal polynomials, Po,pi , . ■ . ,p n - 1 • Recursive application of a method for 
generating p n leads to a complete set of basis polynomials. The polynomial basis is 
of particular importance, as the Weierstrass Approximation Theorem assures us that 
any continuous function /, defined on [0,1], may be approximated arbitrarily well 
with polynomials [Ref. 2]. 

There are a number of existing techniques for the generation of or- 
thogonal polynomials. For example, the Gram-Schmidt algorithm may be applied to 
the sequence {1, x, x 2 , . . . , x n , . . .}. Another approach involves solving a three-term 
recurrence that generates the polynomials. We consider an optimization approach, 
in which we formulate an optimization problem whose solution gives us p n . It is an 
approach that is suitable for inductive iteration. 
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2. Example 2: Uniform Approximation of the Expo- 
nential Function 

The second example is a specific problem in uniform approximation. 
We seek the linear combination of polynomials on the interval [0,3] that best ap- 
proximates the exponential function in the uniform norm sense. The problem, con- 
sequently, is to find the coefficients a t , that minimize the expression 

max I 

fe[0,3] 

where f(t) = ^ a t t'. 

i=0 

We consider specific cases of this example. That is, we seek the polynomial for some 
fixed degree, n, that best approximates the exponential function. Note that the 
uniform approximation problem is fundamentally an optimization problem. The use 
of the Simplex algorithm to solve the problem is, however, unusual. 

C. EXAMPLE 3: THE IMAGE RECONSTRUCTION 

PROBLEM 

The third example is the image reconstruction problem. As with the first two 
examples, there are many existing techniques for solving this problem. Unlike the 
others, however, this is an active area of modern research, and the best methods 
of solution may yet be unknown. The reader is referred to [Ref. 3] for a thorough 
treatment of the problem, and to [Ref. 4] and [Ref. 5] for an introduction to some 
recently developed solution techniques. 
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Suppose a neurosurgeon wishes to rule out the possibility that a patient, Fred, 
suffers from a brain tumor. Further, the physician opts to make use of the CAT 
(Computer Aided Tomography) scan device, and examine the inside of Fred's head 
without exploratory surgery. 

The CAT scan machine works by projecting a finite number of X-rays of known 
intensity into the patient’s head from a finite number of positions. The intensity of 
the X-rays upon leaving Fred’s head is measured. The intensity of the emergent X-ray 
depends essentially on the density of Fred's head over the locations through which 
the X-ray passes. Having collected data from a number of X-rays, the gathered data 
are processed, forming a model of the density of Fred’s head. That is, the processing 
of the data results in the construction of an image, and presumably, an image that 
closely corresponds to the interior of the Fred’s head. This data processing, in this 
example, constitutes solving the image reconstruction problem. 

1. X-Ray Computed Tomography 

Understanding the methods of reconstruction requires that we know the 
process by which the data for reconstruction are obtained. We begin with a basic 
discussion of the manner in which an X-ray moves through an object of homogeneous 
density, then derive the manner in which it moves through more complicated media. 

It has been shown empirically that the fractional decrease in beam 
intensity of a narrow beam of X-ray photons passing through a homogeneous material 



A x 




Figure 1. An X-Ray Over a Homogeneous Object: 7 0 = Input Intensity. 1 = Emer- 
gent Intensity, p = Density. 



is given by the relationship ([Ref. 6]) 



_ 

I ~ ’ 

Jo 

where Io is the X-ray input intensity, and I is the observed intensity after the ray 
passes a distance Ax through the material. See Figure 1. The parameter p is de- 
termined by the density of the material. 1 For two media, the fractional decrease is, 
predictably, 



]_ — „-(*>i(Axi)+p 2 (Ax 2 )) 

l~ ’ 

Jo 



where Ax, denotes the distance the X-ray travels through the i th medium. 



1 p also depends, to a lesser extent, on a number of other factors, including the nuclear composition 
characterized by the atomic number Z, a function of th 'mogeneous material. [Ref. 3] pertains. 
For the purposes of this paper, the effects of other parai. ters are assumed to be nil. 
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Let us partition the media through which the narrow beam travels into 
n homogeneous segments. Denote the density over a single segment by p(x). The 
decrease in this case is expressed by 

j- q = exp . (II. 1) 

Letting n => oo, and Ax, =£> 0, equation (II. 1) becomes 



1 _ 

/o 



lim exp I — ^2 p ( x ») Ax,- 



n— ► oo,Ax— >-0 



i=I 



exp J p{x)dx^j , 



implying 

— In— = f p(x)dx. (II. 2) 

1 o J 

Concluding, let / be the line describing the path of the X-ray, and the function, /(x, y) 
is the density of the media over /. Let ds denote a length over the line /. Equation 



(II. 2) may be written in the form 

— In -L = J'f(x,y)ds. 

2. The Radon Transform 



(II.3) 



This section is an introduction to the Radon Transform, and elaborates 
its relation to the data collected with the X-ray. We first define the transform, then 
briefly describe some of its properties. The discussion in this section pertains to the 
two-dimensional case. That is, we wish to find the density of an object defined over 
some subset of . For generalizations into higher dimensions, see [Ref. 3]. 
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We begin by considering some density function /, defined and bounded 
on a simply connected, compact subset Q C 1Z 2 . Define L to be the set of all straight 
lines passing through any portion of f2. That is, L = {/ : / f] D 7 ^ 0}. Note that 
the cardinality of L is uncountably infinite. The Radon transform is defined by all 
possible line integrals of the form: 

f = [ f{x,y) ds, j « J , (H-4) 

Jij 

where ds is an increment of length along /j, and J is the index set of the set L. 

Consider how the lines, over which the integrals above are computed, 
are determined. Let p — [cos^>, sin <f] T . Then for a fixed angle of rotation (f> and 
a distance p from the origin, we may identify each line, /, by the set of vectors, 
x = [x,y] T , that satisfy the equation 

(x,/r) = x cos<^> + y sin <j) = p. 

(See Figure 2 ). Consequently, we may denote each of the line integrals defining the 
Radon transform by 

/(<M=/ % / (x) dx. (II.5) 

Again, it is vital to note that the Radon transform is defined by the 
collection of all such line integrals. Consequently, to determine the Radon transform 
fully, we must know ) for all values of <j> and p. When we know the value of the 

line integrals for only certain values of <f> and p, we say that we have a sample of the 
transform. 
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y 




Figure 2. The Line, L, as it Relates to ( p , 6) 



3. The Problem Statement 

We note that the right hand sides of Equations (II. 3) and (II. 4) are 
identical. We conclude, then, that if the X-ray is sufficiently narrow, and we are 
able to take an X-ray along all possible lines, the resultant infinite collection of data 
corresponds to the Radon transform of the desired density function. 

The Radon transform has been shown to be one-to-one ([Ref. 3]). That 
is, when all values of the line integrals are known, one may determine the unique 
density that produces the observed transform data. However, in most cases of prac- 
tical interest, we are presented with but a sample of the transform from which to 
reconstruct an image. That is, we are able to collect only a finite number of X-rays. 
Additionally, the photon beam is not sufficiently narrow to be a true line integral 
defining the transform. In this case, inverting the transform is an ill-posed problem. 
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If there exists one density function whose sampled Radon transform equals a given 
data set, then there exist infinitely many density functions, / such that / = b. where 
b is the data obtained from a transform sample. It is this fact that leads us to 
investigate an optimization approach to the image reconstruction problem. 

D. OPTIMIZATION 

Each of the examples can be formulated as an optimization problem. Fun- 
damental to any optimization problem, and to the Linear Optimization Problem, in 
particular, are the concepts of feasible set and objective function. 

1. Feasible Sets 

To help explain a feasible set, we consider an example. Suppose we 
wish to model the production schedule for a baseball and softball manufacturing 
plant. The company is required to make at least 500 baseballs and 1000 softballs 
each month to satisfy contractual agreements. The company expects to procure 2,000 
pounds of stuffing material, and 3,000 square feet of leather covers. Each baseball 
requires ^ pound of stuffing, and | square feet of leather. The requirements for the 
softballs are | pounds and | square feet of stuffing and leather respectively. Then 
of all possible production schedules, we restrict our attention to those that fulfill 
contractual requirements and do not utilize assets which are not available. Let b and 
s be the number of baseballs and softballs, respectively, produced in a month. Then 
we require that: 

b > 500 
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5 > 1,000 



-b+ls < 2.000 

4 8 

lb+js < 2,000. (11.6) 

We have defined a subset of all possible schedules by a group of mathematical re- 
lationships. In this example, the feasible set is the set of all production schedules 
that satisfy the equations of (II. 6). In general, we define the feasible set, Y , to be the 
collection of values satisfying the mathematical relationships imposed by the problem. 

2. The Objective Function 

The objective function, g, defined over a feasible set, V’, is the function 
by which one models the quality of a solution. In the manufacturing schedule example, 
we might logically define the objective function to be profit. Supposing that each 
baseball contributes $1 of profit, and each softball, $.75, we could write our objective 
function: 

g = b -f .75 s, 

and we seek the maximum value of g over Y. 

Simply stated, an optimization problem is expressed by “Considering 
all members of the feasible set, Y , which member(s) results in the optimum value of 
the objective function, gV' 
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E. LINEAR OPTIMIZATION 



The Linear Optimization Problem, or LOP, is defined by the criteria that 
the objective function and the relationships defining the feasible set be linear in our 
decision variables , or the variables representing the values we seek. Then we may 
write the LOP as follows: 



Let a vector c = [cj, C 2 , . . . , c n ] T e 7Z'\ a non-empty index set S, and for every 
s t S a vector a(s) t 7 Z n , and a real number b(s) be given. Defining (u, v) as 
the standard inner product, we seek a vector y e 7Z n , called the optimal vector , 
that minimizes: 

<c,y) 



while satisfying: 
for all s e S. 



(a(s),y) > b(s), 



We observe that a linear maximization problem may be put into the form 
above in the following way. The linearity of the objective function assures us that it 
is continuous on Y , and that the feasible set is compact. Then max(/) = min(— /), 
and we may equivalently seek to minimize (— /). 

A similar change may be made in the constraints to reverse inequalities if 
necessary. That is, the problems 

Maximize: (c, y) 

Subject to: (a(s),y) < b(s) 

for all s t S 
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and 



Minimize: — (c, y) 

Subject to: — (a(s),y) > — 6(s) 

for all .s t S 



are identical. 

1. The Linear Program 

The case where the cardinality of S = m < oo defines a Linear Program. 
This special case of the LOP is of particular interest as it forms the basis for finding 
solutions to LOPs when the index set S is infinite. Throughout this thesis, the reader 
may assume that discussion of the general linear optimization problem permits the 
possibility of an infinite index set 5, unless explicity otherwise noted. 

Now, however, we examine this Linear Programming case to clarify the 
concept of linearity. The problem becomes 



minimize 


(c,y> 






subject to: 


(a(si),y) 


> b(si) 




1, 2, . . . , m, 


over all 


ytH n . 


(117) 



Let a : (s t ) denote the j th component of the vector a(sj). We may write the problem 
as 



Minimize c x y x + c 2 y 2 + b c„y n 
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Subject to: ax{s x ) y x + a 2 {sx) y 2 4 a n {sx) y n > b(s x ) 



01(52) 2/1 + 02(^2) 2/2 H ha n (5 2 )yn > b(s 2 ) 

Ol(5m) J/l “I” 0 2 (-S m ) t/ 2 ■)■ • • • 4" dn(^m) Vn — b[s m ) 

over all y e 7£ n . (II. 8) 

We note that in this case, we may define the feasible set by the notation 

-4 T y > b (II. 9) 

with A t 7 Z nxm , and the i th column of A is a(sj). The i th component of the vector b 
is given by f>(s,-). The linearity assumptions can be expressed, as follows [Ref. 7]. 2 

1. Proportional : The objective function is linear in the feasible set, T, 
in the following sense. Given a variable, y_,, its contribution to the objective function 
is Cjyj. So then a change of d units in y 3 results in a change in the objective function 
value of c 3 d. Similarly, the constraints are linear with respect to the variable y 3 , 
insofar as the contribution of the variable y 3 to the i th constraint is aj(si)y 3 . Then 
changing the value of yj by d units changes the value of the left-hand-side of the i th 
constraint by aj(si)d units. 

2. Deterministic : The components of the vectors a(s) and c are all 
determined, as is each scalar b(s). That is, if the components are derived from some 

2 [Ref. 7] also identifies the qualities of additivitiy and divisibility as requirement of the linear 
optimization problem. These qualities are deemed to be inherent in the qualities defined above. 
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stochastic model, their variability is disregarded, and the numbers are fixed for a 
given linear optimization problem. Having defined the Linear Optimization problem, 
we now turn our attention to exploring the utility of solution techniques to noil- 
traditional optimization problems. 
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III. 



THE EXAMPLES, A DIFFERENT 
PERSPECTIVE 



A. OVERVIEW 

This section addresses some of the basic properties of the sets from which we 
choose an optimal vector in our examples. It is the structure which we are able to 
assign to these sets that permits us to exploit the theories regarding convexity, and 
subsequently, the duality results which we derive in subsequent chapters. We then 
introduce assumptions that refine the feasible sets. 

B. LINEAR VECTOR SPACES 

Before proceding to the specific examples, we first turn our attention to the 
matter of linear vector spaces. A vector space, L, is called a linear vector space if for 
any vectors x,y,z e L and any real scalars a and ° the following results hold [Ref. 
1 ]: 

1. q(x + y) = (ax + ay) e L 

2. a(/?x) = (a/?)x 

3. x + y = y + x 

4. x + (y + z) = (x + y) + z 

For each of the example problems, the feasible set is a subset of a linear vector 
space. Consider the problem of finding an orthogonal polynomial, p n , of order n. It 
is elementary that the set of polynomials of order n form a linear vector space. The 
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same holds for the problem of finding the polynomial that best approximates the 
exponential function on [0,3]. Finally, in example three, we have specified that we 
wish to find a density function, /, from the set of all bounded, piecewise continuous 
functions with support over a compact set f2. The set of all such functions is a linear 
vector space. 

Equally important to our discussion is the concept of a norm. In general, a 
norm on a linear vector space L is defined to be a mapping, denoted |||| : L — > 7v + 
satisfying the following rules [Ref. 1]. For all x,y e L, and a t 7v, 

1. ||x|| > 0 and ||x|| = 0. x = 0 

2. j|a:r|| = | a | ||x||, 

3. ||x + y || < ||x|| + ||y || - 

Any linear vector space equipped with such a function is said to be a normed linear 
vector space. Each of the feasible sets of the examples is a subset of a normed linear 
vector space. The first two examples are clearly so. Any norm on 7v n suffice. In the 
third example, we use the infinity norm, defined by: 

WfWoo = sup | f(u) | 

u >t£l 



as an appropriate norm. 



C. REFINING THE FEASIBLE SUBSET OF THE OR- 
THOGONAL BASIS PROBLEM 

In the first example, we are interested in finding a polynomial, p n . of order n, 
such that: 

(p n ,P>) = [ PnPidx = 0, 

Jo 

for all 0 < i < n — 1, 

where the result is assumed true for all pi,p : ,i ^ j. That is, given orthogonal poly- 
nomials po,Pi , ■ ■ ■ ,Pn-\ , we se ek a polynomial of order n, orthogonal to all of the 
polynomials of lower order. We formulate this problem: 

minimize: Jq p n p t dx 

Subject to: fo PnPi > 0, for i = 1, 2, . . . , n — 1. (III. 1 ) 

Theorem 1. The optimal objective function value for the orthogonal polyno- 
mial problem is zero, and any optimal vector, p n satisfies the desired orthogo- 
nality conditions. 

Proof: Since we know triangular families of orthogonal polynomials exist, we con- 
clude immediately that the optimal objective function value is bounded above by zero. 
The constraint gives us zero as a lower bound. That any optimal vector satisfies our 
orthogonality conditions is immediate from these facts. That is, a zero objective 
function value, in conjunction with the constraint ensures orthogonality. □ 

There are infinitely many polynomials that satisfy the above criteria. Specif- 
ically, if the objective function evaluates to zero for some p n , it clear evaluates to 
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zero for a p n , for any q e 7Z. Consequently* we add the additional constraint that the 
polynomial we desire is the monic orthogonal polynomial. The additional constraint 



leads rather easily to an n x n linearly independent system of inequalities* where the 
unknown element of 1Z U is the vector whose components are the coefficients of the 
desired polynomial. 

To illustrate, let us consider the specific cases of finding the first order and sec- 
ond order polynomial satisfying (III.l). We input the zero </l order monic polynomial, 
po = 1, to start the iterative process. 

In the first order case, the objective function 



n-l 

S / Pi{x)da 

f^o Jo 



is simply 



f 1 f l 1 

/ p l (x)dx = / (x + a) dx = - + 
Jo Jo L 



a. 



The optimization problem takes on the form. 



minimize: ^ + a 

subject to: ^ + a >0, 

from which we observe that a = — and conclude that p\{x) = x — While the 
solution of this particular problem is trivial, there are some important conceptual 
principles working here. Considering the problem in terms of the linear optimization 
problem, observe that the feasible set is the set of all real numbers a. with a > — 



21 



As the function we seek to minimize takes on the form, C + a. where C is a fixed 



c .istant, we clearly wish to select the smallest possible value for a. 

Similarly, consider the formulation of the problem of finding the second order 
polynomial. We define the polynomials p 0 and pi, as above, and let p 2 = x 2 + qi x + a 0 . 
Computing the integrals, we find that 

f 1 /- 1 1 Qj 

/ P 0 P 2 = P 2 = ~ + — + Qo, 

Jo Jo 3 2 

and 

P 1 P 2 = J 0 (y X ~ 2 ) a ' lX Q °) ^ x 

= + " 5 ) x2 + ( Qo - f) x -f 

- I , fSl _ I\ , (22 _ fM _ ^2 

“ 4 + V 3 6/ + V 2 4 J 2 

= A + a. 

12 12 

Then the linear optimization problem 

minimize: E”=o /o P>Pn 

subject to: /J p,p n >0, i = 1 , 2 , . . . , n — 1 , 

becomes: 

minimize: + a 0 

subject to: ^ + |c*i + a 0 > 0, 

A + >0- (III.2) 
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As we are currently finding the feasible set, and viewing the problem in terms 
of the general formulation, we make the following observations. The index set S has 
cardinality 2. By rearranging the constraint equations, we find the constraint vectors 
are 



a (si) 

a(s 2 ) 



1 • 

12 



iT 



n T 



with b(s i) = — and 6(s 2 ) = — yr. As the vector we seek, y = [a^ a 0 ] T e 7v. 2 , we 
may illustrate the feasible set as in Figure 3. 

We observe that we have a problem of finding the optimal vector in 7Z 2 when 
seeking the second order orthogonal polynomial. This property generalizes for any 
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order of polynomial. That is, if we seek a polynomial of order n, we seek a vector, 
y c 7Z'\ giving the coefficients for the optimal monic polynomial. 

D. THE FEASIBLE SET IN THE UNIFORM APPROX- 
IMATION PROBLEM 

Consider the problem of approximating the exponential function, e*, in the 
interval [0,3] by a linear combination of polynomials. We have specified that we wish 
to find the combination that minimizes the maximum residual over the interval, and 
not the total residual. Hence, we are not solving the least squares problem, where 
orthogonality of the approximating functions dramatically simplifies the task. With 
the uniform approximation problem, however, orthogonality of the polynomials is not 
particularly useful. Therefore, rather than using the orthogonal polynomials above, 
we merely specify the degree of the approximating polynomial. Thus we seek a linear 
combination of the polynomials 

Xr=o ctiPi(t), 

where Pi(t) = t', i = 0, 1, 2, . . . n. 

Consider the specific example for n = 1. We seek a polynomial 

(T,y), where T = [l,t] T , and y = [q 0 , ai] T e7£ 2 . 

Since the vector T is fixed, the problem is equivalently one of determining the optimal 
vector, y e 7Z 2 . We summarize with a preliminary statement of the problem. 
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minimize: max (t [ 0 , 3 ] | (E"=o Q «^) ~ e * I 

over all y = [q 0 , . . . , a n ] T c 7v n+1 . 

Observe that the objective function is non-linear in the decision variables, a t , i = 
1 Also observe that the feasible set is 7Z n+l in its entirety. That is, any 

combination of real coefficients is feasible, since there are currently no constraints. 

E. CONVENTIONS OF IMAGE RECONSTRUCTION 

For Example 3, we have specified that we wish to find some function, /, defined 
on the simply connected, compact set C F? . Assume that 0 is a circle of radius 
1. We also assume that the function that we seek is piecewise continuous on fi. The 
piecewise continuity restriction is justified by the physical nature of the problem we 
are solving. We call the space of such functions F. Here it is useful to define a basis 
for F, and we select a logical basis in view of the problem we wish to solve. 

As we have stated, the the formal inverse of the Radon transform is well 
defined. Our difficulty results from our inability to compute the uncountably infinite 
number of line integrals defining the Radon transform. This difficulty stems first 
from the fact that the region over which an X-ray is measured is not one-dimensional. 
That is, the region over which the X-ray measures mass has both width and length. 
Each X-ray measures the density of the medium over some strip, as in Figure 4. 
Additionally, the number of data points from which one reconstructs an image is 
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Figure 4. A Single Density Measuring Strip 



finite, rather than uncountably infinite, as required for formal transform inversion. 
A more accurate perspective from which to view the data obtained by the X-rays is 
presented here. 

Begin by fixing an angle (f>. We associate with each strip of <f>, a label (<f>, i). 
We introduce the strip characteristic function, 7. Define the real valued function 7 
defined on fl by the rule 






1, if uj lies in strip (<£, i) 

< 



0, otherwise. 

Then an integral defining the sampled Radon transform, for a fixed angle, 0, and a 
fixed strip, (<£, i), becomes 



f*,i = / /M 

Jo. 
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Let us define a view to be the set of all strips for some fixed angle, We impose two 
restrictions. First, we require that all strips of a view are non-overlapping. Mathe- 
matically, if (4>,i) and (<p,i) correspond to two strips of the same view, 

7 *,«(w) = 0, for a11 * 7 ^ J, 

for any u e ft. 

Second, we require that the strips composing a view completely cover the 
compact set. That is, for any u> e fl. and every angle <f>, there exist some strip (<f>, i) 
such that 

7 *A U ) = L 

See Figure 5 for a graphical presentation of these properties. 



27 



Assume that we have some manner in which to control the width of the strips. 
Then we may select some number of strips of equal width for each view. Identifying 
the number of strips for view <f> as n#, and the width of a strip for view (p as 6^, we 
may conclude that: 

xb < t > = 1, the diameter of fl. 



For a finite number of views, N v , the application of this convention partition 
the set fi into a finite number of polygons. We call the set of these polygons a 
polygonal partition of fl. Figure 6 illustrates the manner in which these polygons are 
formed. With each of the resultant polygons, Sj, we associate a scalar, area(sj), and 
a characteristic function, 




if u e Sj, and, 



0 otherwise. 

It is the set of these characteristic functions, V’j that we use as the basis for the 
function space, F. 



Theorem 2. For any continuous function , g defined on f), and any e > 0, 
there exist some polygonal partition on n polygons, and some function 

f = it, 

:=] 



such that ||/ — g|| OO ^ £• 



Note that we may write ||/(w) — with the equivalent notation, 



ll/M -sMIloo 



max 

/ = ,71 



{max | {f{u) -g(u))xhj |j . 
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Figure 6. Polygons Created by the Views '-r-, {? = 0, 1, . . . ,4}, Each with 4 Strips . 



One may easily verify that ||/(u>) — <7(w)||oo is a norm. Note that we may use the 
maximum over j rather than the supremum, as the polygonal partition is a finite set. 
The properties of our function space, F, allow us to use the maximum rather than 
the supremum over each polygon, s : . 

Proof: Let g be any continuous function in F, and let e > 0 be given. As g is 
continuous, there exists some S > 0 such that 



||(x, y ) - (p, < 5 implies that 

\\g( x ,y) -g{p,q)\\oo < e. (in.3) 



We use only two angles, <pj = 0 and <p 2 = Let n ^ = n^ 2 = [j], Note that this 
implies that 



n — x n^ 2 = 
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as in Figure 7 . Further, for any two points, ( x,y ) and (p, q) in a fixed polygon, 
ll(z.y) - (P.9)l|oo < & 

Let / = 5 Z" =1 aitpi. We now consider ||/ — p||oo- 

11/ ~ y||oo = max { max | (f(u) - g(u))^ : | 1 

j = ^ u JCSj J 

= max | max | jr ajV’.V’j - I i 

j=\,...,n y uttSj ^ J 

= max \ max | ajtpj - g(u) 1 >j | i 
= max { max I a,- — q(u > ) I > . 

Since <7 is continuous and each of our polygons is compact, g achieves its maximum 
and minimum on each square. For each square, Sj, define Mj = max^^ g{u), and 
m 3 = min^j g(w). Choose 

M-j + rrij 
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Using the continuity of g to invoke the intermediate value theorem, there exists some 



Co e Sj such that g(Co) = a 3 . Further, we know that u> e s 3 => 1 1 ^ ^ 1 1 oo < S- Therefore, 

for any square, s,, 



While the above proof uses only two views, one may increase the number of 
views, or insist on narrower strips in the partition of Q. Clearly, such a refinement 
can not degrade the approximation of the function g , but only maintain or improve 
it. We may, at worst, maintain the same constant values over the new polygons that 
they were assigned over the coarser partition. 

We now demonstrate the utility of defining a basis for F. Let k = J2<t> n <t>- 
That is, let k denote the total number of strips defining sample transform. For any 
polygonal partition P on n polygons, the sample Radon transform with respect to P, 
which we denote //>, may be written as 



max | g(Co) — g(to) | < £, implying 



I a j “ g{w) | < £, for j = 1, . . . , n. 



Therefore 




□ 



Ip = -4 T y 
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where A is an n x k matrix, and y t lZ n . The matrix A is given by: 



0, if 7j(u,’) = 0 for all u j 

area(s ; ), otherwise. 

That is, the represents the area of the i th polygon if the polygon falls within strip 
j. The i th component of y is the mean density of the function / over polygon i. 

For any fixed polygonal partition, the feasible set is a subset of the infinite 
dimensional vector space, F. Each element of the subset may be thought of as a 
vector in 7 Z n . Without further restriction, the feasible set becomes the set of all 
vectors, y c lZ n such that A T y = fp. We exploit many of the subsequent theorems as 
a result of the ability to translate the problem into 7Z n . 
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IV. 



CONVEXITY 



A. OVERVIEW 

In this chapter, we investigate the concept of convexity, both as it pertains 
to sets and to functions. The primary motivation for this investigation comes from 
the fact that we may, when certain convexity conditions are met. conclude that local 
maxima and minima are global. Stated differently, we may eliminate a portion, often 
a large portion, of our feasible set from consideration when attempting to find the 
optimal value of our objective function. This chapter lays the groundwork for our 
investigation into duality, contained in the following chapter. 

This and the following chapter form the foundation for linear optimization, 
and, consequently, the concepts and results herein may be found in most elementary 
texts on the subject. The material in this chapter is taken primarily from [Ref. 8] 
and [Ref. 9], to which the reader is referred for further study. 

B. CONVEX SETS 

Let us return briefly to the image reconstruction problem. Consider two ar- 
bitrary functions, /, g t F, the space of bounded, piecewise continuous functions on 
the compact set, Q. Select some arbitrary value for a parameter, A. We require that 
A t [0, 1]. Consider the function, 

h(u) = A f(u) + (1 - A)p(^'). 
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First note that as both / and g are defined on Cl, so is h. As / and g are bounded on 
a compact set, M = max {sup n /(w), sup n g(u>)} is well defined. We know that 

h(u) < AM + (1 — A )M, implying that 
h(u ) < M, for all u Cl. 

Consequently, the function, h is in F. The important items to note here are that f.g, 
and A e [0.1] were each chosen arbitrarily. We conclude, then, for any two elements 
f,geF and for any A e [0, 1] the function, 

h = A / + (1 — A )g e F. 

The above example proves that the set F is a convex set. A set C C L, a linear 
vector space, is called convex if for any two elements y,z e C and A e [0, 1], 

x = Ay + (1 — A)z e C. 

Any element y e C of the form y = £"=i ^<y>) 23"=i 0 < A; < 1 is 

called a convex combination of yi,y 2 , . . .y n - This convex combination is called strict 
if A, e (0, 1) for all i. That is, the convex combination is strict if A, ^ 0 or 1, for all i. 
We now examine a fundamental characterization of convex sets. 

Theorem 3. [Ref. 8] Let C be a convex subset of L , an n-dimensional linear 
vector space. Every convex combination of the vectors of C is an element of 

C. 
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Proof: For n = 1. the claim is trivial. Assume that the statement is true for 7' < 7 ? — 1 
where n > 1. Now we consider some convex combination 



n n 

y = ]T Ajy,, where y-, c C, ]T A; = 1, A; > 0. 

i=l i=l 

If A„ = 1, then we are done, so we suppose that A n ^ 1. Define 



n— 1 

A = £ A,-, and A' 

i=i 



A, 
A ' 



Then 

n — 1 

y = A^A|y ; + A n y n . 

i=l 

Note that sum of the first term satisfies the conditions of the inductive hypothesis. 



That is. 



We conclude that 



n— 1 

^2 K = 1, and a; > 0. 

1=1 



y = A lyi j e C ’ and 



y = Ay + A n y n . 



Now consider the expression: 



A + A n 



n — 1 

= ^22 A * + A « 

t=i 



= EA, 



= 1 . 
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Then by the definition of a convex set, y e C. 



□ 



Let A e 7 Z nXm , and let b e 7v m . Then it is elementary that the sets 

G i = {x : A t x = b}, and 
G 2 = {x : A t x > b}, 

are convex. We prove the case of G i- 

Proof: Let Xi,X 2 e Gi. Then xi,X 2 e R m , and A T Xi = A T X 2 = b. We select some 
value for A e [0, 1], and consider: 

A t (Axi -f (1 — A)x 2 ) 

= AA t xi + (1 — A)A t x 2 

= Ab + (1-A)b 

= b. (IV.l) 

□ 

One may show G 2 is convex with an identical argument. Note that the set, G 2 
defines the feasible set of the linear program. 

C. HYPERPLANES, POLYHEDRAL SETS, AND EX- 
TREMA 

A hyperplane H in 1Z n is a set of the form {y : (p,y) = k} where p is a 
nonzero vector in lZ n , and k is a given scalar. It is easily shown that the hvperplane, 
//, is a convex set. A hyperplane divides 7v" into two (non-disjoint) regions, called 
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half-spaces ; one is defined by {y : (p,y) > k} and the other by {y : (p,y) < k, } 
both of which are again convex. Note that the intersection of a finite number, m, 
of half-spaces, called a polyhedral set , is also convex, since the intersection may be 
interpreted as {y : A T y > b} where the i th half-space is define as the set 

{y : (a>,y) > bi}. 



That is, A is an m x n matrix whose columns are the vectors defining the half spaces. 
To illustrate this point, we consider a simple example. Define the vectors 





0 




1 




i 


a l = 




, a 2 = 




, and a 3 = 






1 




-1 




0 



We use the above vectors to define the three half-planes in 1Z 2 , 

(ai,y)>-2, <a 2 ,y) > and <a 3 ,y) > -^ 



Using the above convention for identifying the matrix A , and the vector b, we find 



that 



A = 



0 1 1 

1 -1 0 



, and b = 



_9 



We may identify the intersection of the half-planes as the set of vectors, y in 7Z 2 



satisfying the equation, 



A T y > b or, 



0 -1 

1 -1 

-1 0 



y i 

2/2 



> 
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Figure 8. The Polyhedral Set of the Example 

The intersection of these half-planes is illustrated in Figure 8. 

We are interested in simplifying our optimization problem by eliminating por- 
tions of the feasible set from consideration. A critical tool in this reduction results 
from the notion of an extreme point. We here define an extreme point, and use the 
concept to further characterize the convex sets with which we are working in the 
example problems. 

Let C be a convex set. We call yt Can extreme point of the set C if it can 
not be represented as a strict convex combination of the elements of C. Alternately, 
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y 2 




the point y is an extreme point if, and only if, for any A t (0, 1), and for any x,z e C, 

y = Ax + (1— A)z implies that 
y = x = z. 

Geometrically, a point y in a polyhedral set C is an extreme point if lies on some n 
linearly independent defining hvperplanes of C, where n is the rank of matrix .4 r , as 
formed above. Two extreme points are adjacent if the line segment joining them is 
an edge of C . That is, the line segment joining them is formed by the intersection of 
some n — 1 linearly independent defining hvperplanes of C. See Figure 9. 
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Theorem 4. [Ref. 7] Let C be a polyhedral subset of 7v n . If C is bounded, then 
C has at least n + 1 linearly independent defining hyperplanes. 

This theorem is offered without proof. However, its validity for the case of 
n = 2 is illustrated in Figure 9, where the polyhedral set in 7Z 2 has three independent 
defining hyperplanes. An immediate consequence of the above is the following: 

Theorem 5. Let C be an arbitrary bounded convex subset of 7v n . C has at 
least n extreme points. 

Proof: Suppose that there are fewer than n extreme points of C . Since any n linearly 
independent hyperplanes must intersect in a single point point in 7 Z n , there are fewer 
than n + 1 linearly independent hyperplanes, and C is unbounded. □ 

D. CONVEX FUNCTIONS 

We now introduce convex functions, and their primary characteristic with 
which we are interested. This introduction is cursory in nature. For a more de- 
tailed exploration of convexity with respect to functions, the reader is referred to 
[Ref. 8] and [Ref. 10]. 

Let C C TV 1 be a convex set. A function /, defined on C, is said to be convex 
if for any elements x,y t C, and A e [0, 1]: 

/(Ax + (1 - A)y) < Af(x) + (1 - A)f(y). 

If / is convex, then — / is said to be concave. Linear functions are, thus, both convex 
and concave. Having alluded to the utility of convex functions, we state an important 
result formally. 
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Theorem 6. [Ref. 8] Let f be a convex function defined on a closed convex 
set , C C 7 v n . Then a relative minimum of f over C is a global minimum. 



Proof: Let / have a local minimum at yi, and a global minimum at y 2 . with /(yi) > 
f(y 2 ). Let A e (0,1) be given. Because / is convex 

/(Ayj + (1-A) yi ) < A/(y 2 ) + (l-A)f(y,). (IV.2) 

Also, since it is assumed that /(yi) > f(y 2 ), we conclude 

A/(y 2 ) + (1 - A)f(yi) < A/(yi) + (1 - A)f(yj) 

= /( yi). (IV.3) 



We now define N t (y\) = {y t 7c n : |Jy — y j |J < e}. That is, we define an e neighborhood 
about the point, yi. If 

0 < A < and, 

llyi -y 2 \\ 

y = Ay 2 + (1 - A)y l5 

then y c N ff (y!). Then 

/( y) = /(Ay 2 + (1 - A)yi) 

< A/(y 2 ) + (l-A)f( yi ) 

< A/(yi) + (1 - A)f(yi) 

= /( yi), 



contradicting IV.3, and the fact that / has a local minimum at yi. We have shown, 
then, that only absolute minima are possible. □ 
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If the objective function is convex (which it must be since we are considering 
only linear objective functions), we can be sure that we have found an optimal vector 
if it is locally optimal. This fact forms the basis for the Simplex algorithm, which we 
explore in the following chapter. 

Theorem 7. If an optimal solution to the Linear Program exists, that is, if 
min {/(y)} exists and is finite for some y in the feasible set, C C 'TP', then 
there is an optimal extreme point. 

Proof: Let y e C be an optimal vector, but not an extreme point. Let a linear 
objective function / defined on the polyhedral set C be given. Since | / |< oo at 
an optimal vector, one may clearly add sufficient number of hyperplanes to bound 
the feasible set if it is not already bounded, without changing the optimal solution. 
Assume that / is optimal at y. We consider two cases. 

Case 1: The vector y does not lie on an edge of C. 

We first recognize that y can be written as a convex combination of the extreme 
points of C , since there are at least n linearly independent extreme points. Let 
E = {e : e is an extreme point of C}. Let E have cardinality r. Then we may write 
y = 23[=i A; e j . The linearity of the objective function, /, implies /( y) = £[=1 Aif(e;). 
Let ej be some extreme point such that /(ej) > 0, and let us decrease the value of \j 
by S > 0 units. We may do so without leaving the feasible set since we are not on an 
edge. Note that if no such extreme point ej exists, then we may increase the value of 
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A^, and the argument still holds. Call the new element of the feasible set y'. Then 

/( y') = / ^A,ei + (Aj -<J)ejj , 

= / + Ajej - £ej)J 

= /( y-^ej) 

= /(y)-tf(ej) 

< /( y), 

implying that y is not the optimal vector, a contradiction. Hence if y is a non-extreme 
optimal vector, it must lie on an edge of C. 

Case 2: The vector y lies on an edge of C. 

Since y is not an extreme point, but is on an edge, it is on the line segment 
joining two extreme points, ej and e 2 of C, and may be written as y = Aei -f ( 1 — A)e 2 , 
for some A e (0,1). Parameterize the line segment between the points by the 

equation y(t) = tei -f (1 — t)e 2 , as t : 0 — > 1. Fix some t e [0,1], , and let y' = 
(1 — t)ei + te 2 . Then 

/(y')-f(y) = (1 — 0/( e .) + tf(e 2 ) — Af(ei) — (1 — A)f(e 2 ) 

= -(<-(l-A))/(e,) + (t-(l-A))f(e 2 ) 

= ( i -(l-A))(-/(e 1 ) + f(e 2 )) 

> 0, for all t, since y is the optimal vector. 

Since y is not an extreme point, it can be represented as a strict convex combi- 
nation of ei and e 2 . Therefore, we may choose some it (0.1). such that 
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t > (1 — A). Then 



i— (1 — A) > 0, implying 
— /( e i) + f( e 2) > 0. Therefore, 

/(e 2 ) > /(e,). 

An identical argument yields the result that /(e 2 ) < f(ej). We conclude that /(ej) = 
f(e 2 ), and that any t c [0, 1] results in an optimal vector. Choosing t = 0, or t = 1 
places us at an extreme point. □ 

An alternate proof may be found in [Ref. 7]. 

E. AN ASIDE: THE CONVEX HULL 

We desire to work with convex subsets of linear vector spaces, as they have 
useful characteristics when we attempt to solve more general optimization problems. 
However, there is no guarantee that an arbitrary set is convex. For such cases, we 
define the convex hull of an arbitrary set A C L, denoted Conv(A), as the set of all 
possible convex combinations of the elements of A, where L is a linear vector space. 
An example of a convex hull of a non-convex set in 7Z 2 is displayed in Figure 10. 

Clearly, if A is convex, then Conv(A)=A. The intuitive notion that the convex 
hull of a set, A C L is the smallest convex subset of L in which A is contained, and 
conversely, are easily proven theorems (See [Ref. 8]). 

The real utility of the convex hull stems from the fact that any element of 
Conv(A) may be written as a convex combination of the elements of A. Generating 
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Extensions required to form convex hull 




Figure 10. Forming the Convex Hull 



the convex hull does not add any new extreme points. This is offered without proof. 
The interested reader should consult [Ref. 8]. Consequently, if we are solving an 
optimization problem with a linear objective function on a non-convex set, .4, then 
solving it over the convex hull of the set A , rather than over the set A , itself, does 
not change the solution of the problem. 
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V. 



DUALITY AND THE SIMPLEX 
ALGORITHM 



A. OVERVIEW 

The concept of duality makes it possible for us to bound the optimal value for 
the objective function, /, and in many cases, to solve the LOP more efficiently. As 
before, let c be a vector in lZ n . Let S be an arbitrary index set. We have previously 
stated that for every s c S, we associcate a vector a(s) in 7£ n , and a scalar b(s). The 
general form of the linear optimization problem is: 

minimize: (c,y) 

subject to (a(s),y) > 6(s), for all s e S 

over all y e 7l n (V.l) 

We know that we achieve an upper bound for the optimal value of the pref- 
erence function as soon as we find an element of the feasible set. However, we have 
no such simple criteria for determining a lower bound. Intuitively the prospect of 
finding some feasible vector is less daunting than solving the problem. Using duality 
allows us to form an associated optimization problem, find a feasible vector in the 
associated problem, and use the feasible vector to derive a lower bound of the origi- 
nal problem. The associated optimization problem is called the Dual. In some cases, 
we may bound the original optimization from below arbitrarily well using the dual 
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problem. We refer to the original linear optimization problem as the primal. P. I he 
primal, P. and its associated dual. D. are referred to as a Dual Pair . 

Define the value of a LOP to be the optimal objective function value. We seek 
properties that allow us to approximate the solution of a linear optimization problem 
arbitrarily well, and to determine when the optimal value of the linear optimization 
problem and its corresponding dual are the same. 

This chapter, in conjunction with the previous chapter, forms the fundamental 
principle underlying the Simplex algorithm. The reader is again referred to [Ref. 7] 
and [Ref. 9] for more detailed descriptions of the material of this chapter. 

B. WEAK DUALITY 

We begin with the generic linear optimization problem, (V.5). The first the- 
orem that allows us to bound the problem from below is stated here. Note that we 
allow for an infinite index set 5. 

Theorem 8. The Duality Lemma [Ref. 9] Let the finite subset 

(Z. 5, 

and the non-negative vector 

x = [x],x 2 ,...,x q ] T 

be such that: 

c = a(s!)xi + a(s 2 )x 2 H + a(s q )x q . 

Then for any feasible vector y = [yj,y 2 , . . . ,y„] T in the feasible set of the 

optimization problem, P, 

6(5j)a:] 4- 6(s 2 ):r 2 + •••-(- b(s q )x q < c T y. 
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Proof: Since y is a feasible vector, 



(a(sj), y) = a(s;) T y > b(sj), for i = 1 , 2 , . . . , q. 



Further, since x t > 0 by assumption, 

E *(*)*.■ < E ( a ( s i) T y) x u 

i=l i=l 



implying, 




y 



□ 

As an example, consider the problem of finding the monic second order poly- 
nomial, p 2 , orthogonal to both, p 0 = 1, and p\ = x — Recalling from Equation 
(III. 2), the primal of this problem is 



minimize ^ + j^on + o 0 
subject to: ^ + cto > 0, 



12 + i 2 ai - (V- 2 ) 

Disregarding the constant in the objective function does not affect the choice of an 
optimal vector. Consequently, the optimal vector for (V.2) and the LOP 



mininize: (c,y) 

subject to: (a(s!),y) > b(s i) 

(a(s 2 ),y) > b(s 2 ) (V.3) 
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where 



7 

12 


II 

c/> 


i 

K5|«-> 

1 


• a(s 2 ) = 


1 

12 


1 




1 




1 

O 

i 



6 ( 5 !) 

and 



\ 

3 



K S 2) ~ — 



12 



are the same. 

Attempting to satisfy the hypothesis of the duality lemma, we seek a non- 
negative linear combination of a^) and a(s 2 ) that sums to c. That is, we seek a 
non-negative solution to the equation: 

1 j_ 

2 12 

1 0 

Clearly, the only such vector satisfying the equation is the vector x = [1, 1] T . Conse- 
quently, the optimal value of the primal problem of V.3 can be no better than 













3-1 





7 

12 




*2 




1 



^1 6(si ) + X-2 b(S2) 



- (“!) + (“A) - ~ 12 - 

Because the optimal vector of (V.2) must be the same as that of (V.3), the value of 
(V.2) is bounded below by 



_5_ 

12 




as expected. 



C. THE DUAL 

Having stated the duality lemma, we move to a formal definition of the dual, 
and similarly, the dual pair. We begin with the special case of a Linear Program. 
The dual of a linear program 
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minimize: (c,y) 

Subject to: A T y > b 

is defined to be 

maximize: (b,x) 

Subject to: Ax = c 

with X{ > 0, for i = 1, 2, . . . q. (V.4) 

Note that the dual of an LP is an LP itself. To be feasible above, we require a 
non-negative linear combination of the constraint vectors to sum to the vector c. The 
vector f becomes the objective vector in the dual. These facts highlight the difficulty 
of defining the dual of an infinite LOP. Because of the difficulty of computing infinite 
sums (possibly uncountably infinite), we require a variation of the dual for the infinite 
case. 

Recall the generic LOP 

minimize: (c,y) 

subject to (a(s),y) > b(s), for all s c S 

over all y e 7£ n . (V.5) 

As it proves useful in the statement of the dual, we write (V.5) in the alternate form 

minimize Er=i CrJ/r 
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subject to: J3"=i a r( s )l/r > b(s) 



for all s e S. (V.6) 

The dual optimization problem, D , is defined to be: 

Find a finite subset {si, s 2 , s 9 } C 5, and the non-negative numbers, a*!, ;r 2 x q , 

such that the expression: 

9 

t=l 

is maximized, subject to the constraints 

9 

X{O r (<S,' ) — cv, 

t=l 

for r = 1,2, . . . , n. (V.7) 

Tha* is, the dual of the infinite dimensional LOP is to find some optimal 
finite subset of the index set, and then solve the resulting LP dual. In keeping with 
convention, we call the process of taking a finite subset of an infinite set discretizing. It 
is important to note that the dual is, in general, a non-linear problem in 2 q variables, 
since both the discretization and values for the coefficients, x,- are unknown. However, 
once we have chosen a subset of 5, the problem is linear in the unknowns x r . Further, 
one might suspect that if a sequence of discretizations of S is chosen systematically, 
then we may be able to arrive at an acceptable approximation of the solution of 
the associated primal problem, assuming one exists. That is, we may get arbitrarily 
close to the solution of the dual problem, and consequently, find an arbitrarily good 
approximation of the solution to the infinite dimensional primal optimizaton problem 
by solving a sequence of Linear Programs. This is a basic premise behind solving 
infinite dimensional linear programs with the Simplex algorithm. 
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D. APPROXIMATING THE EXPONENTIAL FUNCTION 



The problem of approximating the exponential function with an n lh degree 
polynomial is now analyzed more closely. Of particular interest is how duality results 
enable us to determine the relative quality of a given approximation, and how they 
allow us to bound the error in the problem. 

1. The Primal Problem 



Recall that we stated the problem of approximating the exponential 
function over the inverval [0, 3] as 



Determine the polynomial 

/(<) = Q **‘ 

1=0 

that mininimizes the expression 

sup | f(t) - e' | . 

fe(0,3] 

Let us formulate this problem in terms of the standard linear optimization problem. 
We relabel the index set T vice S and define it to be the interval [0,3]. Realizing that 
the objective function above is a scalar valued function, as a first step we reformulate 
the problem as 



minimize: a n+ i 

subject to: | X"=o (<*«*') — e* | < a n+1 , for all t c T. 

Eliminating the absolute values, we replace each constraint with the equivalent pair 
of constraints, 

n n 

— a,t' + e‘ > — Q n+ i and, ^ a,t' — e‘ > — a n+1 . 

t=0 i=0 
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Rewriting, we arrive at 

n 

— ^a,f + a n+ i > — e' and, 

i=0 

n 

oL{t x + a n + 1 > e*. 

t=0 

Thus, each element of the index set T has two associated constraint vectors. Let 
y = [a 0 , aj , . . . , a n +i] T . We have, for each i c T, a vector 



a(t) 




...,t n , 1 



T 



and the two constraints 



— ( a (t), y) > -e ( , and 

( a (t),y) > e*. 



It proves useful in the formulation of the dual problem to distinguish the two con- 
straints associated with each t c T. As a notational device we identify the vectors 



1 




-1 


t 




-t 


t 2 


, and a(t ) = 


-t 2 


t n 




-t n 


1 




1 



It is important to note that the use of functional notation for the vectors a(t + ) and 
a(t") is used for convenience only. No such functional relationship exists, as there 
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are two constraint vectors for each t c [0,3]. We distinguish the vectors by labeling 
two sets, T + and T~ . Note that T + = T~ = [0.3]. 

Similarly, for each t e T, we have the scalar, b(t + ) = e\ and b(t~) = —eh 
We finally identify the objective function vector, c = [0, 0, . . . , 1] T c / R n+2 . The final 
formulation of the primal problem, P, is 



minimize: c T y 




subject to: a(t + ) T y 


> Kt + ) 


a(t-) T y 


> b(t~) for all t e T 


over all 


y e 7l n+ 2 . 



2. The Dual Problem 

Having put the primal in the desired form, we turn our attention to the 
dual. Referring to the general form of the dual as in (V.7), we seek the finite subset 
T = C T, and the vector, x t 7Z q , that maximizes the expression 

S x Mu) 

T — 1 

while satisfying the constraints 

<7 

^x,a r (f t ) = Cf , for r = l,...,n. 

!=1 

First make the substitutions b(t{) = e 4 ', and a r (N) = t r x . As we have 
defined the set T to be T + UT”, the above formulation is equivelent to the following. 



Find the subsets 

f + = cT + , 
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and 



f- = {*r, < 2 ,... ,<,"+} c t~ 

and non-negative scalars xj 1 ", xj, . . . * and xf, , . . . x~_ with which to as- 

sociate each element of the respective sets, that maximizes 






U=i e 



~ n=i 



and satisfies the constraints 

n + 



Z4 (<f) r -E<('r) r 



t=l 



1=1 



S ^ + J2 x i 

i=l i=] 



= 0, for r = 0, 1, . . . , n 
= 1. 



(V.9) 



The formulation (V.9) may be written in the simpler form 



maximize: Y?i=i e<1;r > 

subject to: Y)i=i x t t r t = 0, for r = 0, 1, . . . , n 

ELi I *.!<!• (v.io) 



where t, t [0,3] for all i. The problems are equivalent in the respect that one may 
derive from a feasible solution of one a feasible solution to the other. The proof for 
this statement may be found in [Ref. 9]. 

3. Qualitative Analysis of Solutions 

We begin by restating the duality lemma in the terms of the uniform 
approximation problem. 



Theorem 9. [Ref. 9] Let the finite subset T C T, and the real numbers 
X\, X2 , . . . , x q be feasible for the dual problem of equation (V.IO). Then the 
following holds for any y e 7£ n+l : 



E 

1=1 



x t e 



< sup 
UT 



X>r - e' 



r= 0 



(V.ll) 
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As this is a direct consequence of the duality lemma, it is not proven here, though 
the proof may be found in [Ref. 9], 

Let us consider the problem of approximating the exponential over T 
with a quadratic polynomial. Then from (V.8), the objective function vector c is 
equal to [0,0,0, 1] T . With each t + and t~ e T = [0,3], we associate the vectors and 
scalars 

a(t + ) = [t + ° t +1 t +J lj and b(t + ) = e t+ , 



and 



(t ) = — t — t — t 2 lj and b(t ) = e 1 



respectively. The dual problem, from equation (V.10), is to find the set {fi, • • • , Lj} 
T C [0, 3] and associated non-negative scalars that maximize 






1=1 



while satisfying 



<7 

7: Xjij = 0, for r = 0,1,2, and 

i=i 

i>.l < 1. (V.12) 

«=1 
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Let us arbitrarily choose the subset T to be {0, 1,2,3}. Hoping to apply the restated 
duality, Theorem 9, we first require a solution to the equation: 



1111 
0 12 3 
0 14 9 






x 2 



X3 






o 

o 

0 



(V.13) 



Every such vector is of the form x = [— a, 3a, —3a, a], where a is an arbitrary 
real number. Scaling in order to satisfy the constraint, | x , |< 1, we let 

x = f— |, |, — |, | . The hypothesis of the Theorem 9 satisfied, we conclude that 

the best quadratic approximation to the exponential function over T = [0, 3] in the 
uniform norm sense, differs from e ( by at least: 



tl x x e u = 



-| e ° + l e ' - § e2 + i e3 



.6340. 



t = l 



E. STRONG DUALITY 

Consider the three different possibilities we may encounter in the solution of 
the Linear Optimizaton Problem. Referring to the optimal objective function value 
of the minimization problem as V(P), and to the optimal value of the dual as V(D), 
we list the possible conditions, or states, of the problem as follows [Ref. 9]: 



Inconsistent: (1C) The feasible set is empty, so that no 
solution is possible. 

Bounded: (B) There exist at least one feasible vector, and 

among such feasible vectors, at least one is optimal. 
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Unbounded: (UB) There are feasible vectors such that 

the objective function may be made arbitrarily small. 

A duality gap is said to occur when V(P) ^ V(D), that is, when the optimal 
values of the dual pair are not the same. We hope to find general conditions that 
preclude the existence of a duality gap. Theorems that allow us to disregard the 
possibility of a duality gap are called strong duality theorems. 

1. The Dual and Convexity 

We briefly characterize the dual problem as it relates to our discussion 
of set convexity. Before continuing, we require the definition of the Convex Cone. Let 
C be a convex subset of 7l n . The convex cone of C, denoted y((7), is defined to be 
the set of all vectors y e 7Z n , such that y = Ax, where A > 0, and xeC. 

In Chapter IV we constructed an example of a polyhedral set using the 

vectors 





0 




-1 




1 


a (si) = 




, a(s 2 ) = 




, and (s 3 ) = 






1 




1 




0 



The resultant polyhedral set is illustrated in Figure 8. The darkened region of Figure 
1 1 illustrates the addition to the set, that together with the original polyhedral set, 
forms the convex cone. The darkened portions of the Figure extend to infinity. 

Consider the specific case of the convex cone of the constraints of the 
linear optimization problem. We have expressed the constraints by (a(s),y) > b(s), 

for all s e S. Define A s = {a(s) : s e 5} C PC . We know that A s is convex from 
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Figure 11. Formation of the Convex Hull. 

equation (IV. 1). We refer to the convex cone of A s as the moment cone of the 
optimization problem, P, and denote \(A S ) by M n . 

Having defined the moment cone, we arrive a fundamental characteri- 
zation of the dual problem, D. 

Theorem 10. [Ref. 9] The dual problem, D , is feasible (i.e. the feasible set is 
not empty) if and only if c e M n . 

The proof may be found in [Ref. 9]. The result follows directly from 
the definition of the dual. An alternate interpretation of this result is as follows. The 
dual problem is feasible if and only if we may express the vector c as a non-negative 
combination of the constraint vectors of the linear optimization problem, P. 

The following is a generalization of the theorem that allows us to express 
every element of the convex set, A , as a convex combination of the extreme points. 
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The theorem proves vital in the discussion of the Simplex algorithm, as it allows us 
to bound the required number of elements, s q e 5, in the discretization of our index 
set when forming the dual. 



Theorem 11. The Reduction Theorem [Ref. 9] Let the vector z e 7v n be 
a non-negative linear combination of the vectors , Z],z 2 , • • . , z q . That is, 

q 

Z = £x iZi , 
i=l 

with X{ > 0 for all i. Then we may also write: 

q 

z = ^x| Zi, with xi > 0, 

i=l 

where at most n of the numbers x\ are nonzero. Moreover , the set of vectors 
{zj} corresponding to the nonzero scalars x\ are linearly independent . 



Proof: We first note that if Zi,z 2 ,...,z q are already linearly independent, then 
q < n, and the initial representation of z already satisfies the theorem. Assume, 
then, that q > n, and, consequently, that the vectors, Zi,z 2 ,...,z q are not linearly 
independent. Then we know that we may write 

<? 

J2 a ' z > = 

:=1 

where at least one ^ 0. For any r : a r ^ 0, we have: 



— Z i‘ 
a r 

i^r r 



(V.14) 



Substituting into the equation of our hypothesis, we have: 



“ £ ( Xi " X 'S) Zi - 
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We have, then, a representation of z by a linear combination of q — 1 of the vectors. 

Zj. We must show, then, that the expression (x, — may be made non-negative, 

for i = 1 , 2. . . . , r — 1 , r + 1 , . . . , q. Select some a r > 0. We can clearly do so. as if 
all q, are negative, we may multiply by —1 and still have the desired result that 
Y^ q ,=i —<*i z i = 0. Then in equation (V.14), if a, < 0, we may conclude that 

x , — x r — > 0, 

since x,, j and a r are each nonnegative. 

We now consider the case that o,- > 0. Then we must show that > • £t . 

We may accomplish this quite simply, by selecting the r that minimizes the expression, 

^ over all a r > 0. We have expressed z as a non-negative linear combination of q— 1 
of the vectors, Zj,Z 2 ,. . .,z q , and may continue inductively until we have the desired 
result. □ 

The reduction theorem yields this immediate result. Let S = {sj , . . . , s,} C 
S. and the set of non-negative numbers {xj, . . . , i,} be feasible for the dual problem, 

D. That is: 

9 

^ x,a r (s t ) = Cr , 

1=1 

for r = 1 , 2, . . . , n. Then there is a subset, S' = {s M , . . . , s, n } and a set of non-negative 
numbers, {x( , ...,xj n } that is also feasible for D. Note that we have not included 
the objective function of the dual in our reduction above. It is not necessarily true, 
then, that we need only to consider discretizations of S with cardinality n. That is, 
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let us reduce the non-negative linear combination 

<7 

^x ? a(si), where q > n 
2=1 



to the combination 

E *:«(*)> 

t=i 

where no more than n of the scalars, x f are non-zero. Then it may be that 

E x ib( S i) ^ E X 'M S i)- 

»=1 i=l 

Consequently, we include the optimal objective function value in the set of equations 
for reduction. This convention requires that we define a new moment cone, which we 
call, M n+i . 

v„ +1 = x(A') 



where A is formed by the vectors, 



a # (s) = 



b( Si ) 

ai{s t ) 

a n (si) 



c n n+ \ i = 1,2,. ..,n. 



The dual, then, may be stated 



maximize: cq 

subject to: c e M n +\, where c = [co, Cj, ..., c n ] T 

This formulation is useful in discussion of strong duality results. 
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2. Solvability Conditions 

We move from the infinite case to the case of a finite index set. The 
following results are presented, without formal proof, though they may be found in 
[Ref. 9] or [Ref. 10]. These theorems enable us to determine when the dual problem 
has a solution. That is, we seek to determine when there exists at least one vector of 
our feasible set that minimizes our objective function. Note the distinction between 
solvability and boundedness as defined in the state section above. That is, we may 
have feasible vectors, but no optimal vector in our feasible set. The discussion in 
this section pertains to the finite case of the linear optimization problem. Readers 
interested in an examination of some criteria for the convergence of the LOP in the 
case of an infinite index set are referred to [Ref. 11]. 

Theorem 12. [Ref. 9] Let the linear optimization problem, P, be such that 
Aln+\ is closed, and the dual problem, D , is bounded. Then D has a solution. 

The proof of this theorem is straightforward. Recognize that the objective function 
of the dual is / : 7Z n+1 — > 7Z by f(z 0 ,z l ,...,z n ) = zq. Then / is clearly continuous, 
on a compact set, and we conclude the result. 

Theorem 13. [Ref. 9] Any convex cone P defined by a finite number of vectors 
in 7Z n is closed, in that any convergent sequence of vectors in P converges to 
a vector in P. 

Coupling these observations, we conclude that any finite dual pair, 
(P,D), with both P and D consistent, is solveable. That is, both the primal and 
dual have solutions. 
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Figure 12. The Separating Hyperplane H(y,v) of the Set, M, at the Point z. 

3. Separating Hyperplanes 

We now address the final tool that we use to eliminate a duality gap 
in the linear program. Let H(y,v) = {x t TV : y T x = t/}. Then the hyperplane, 
H(y,v), is said to separate z, a vector not in M, from the convex set, M, if 

y ‘x < v < y z, 

for all x t M. Figure 12 illustrates one seperating hyperplane between the point z 
and the set M which is contained in TV . Let Zo be the vector in M closest to z in 
the Euclidean norm sense. Let y = z — Zo, and let v = 0. Then H( y, u) is the line 
orthogonal to y at the point z 0 . 

Theorem 14. The Separation Theorem [Ref. 9] Define ||x|| to be standard 
Euclidean 2-norm. Let M C TV be a non-empty, closed convex set, and let z 
not be in M. Further, let Zo be the unique vector in M such that ||z — Zq|| < 



64 



||z — x || for all x e M. 1 Finally, let y = z — Zo, and if = (z — z 0 ) T Zo. TVffn //if 
hyperplane. H(y,v) separates z /com A/. 

Proof: Let x e M, and fix 0 < p < 1. Then 

( 1 - /i)z 0 + //x = z 0 + p(x - z 0 ) t M, 

as M is a convex set. Further, 

||z-z 0 || 2 < ||z - (zo + p(x - z 0 ))|| 2 

= 1 1 z - z 0 || 2 - 2p{z - z 0 ) T (x - Zo) + P 2 \\x - z 0 1 1 2 ' 



which implies that 



Let p — > 0. Then 



(z - z 0 ) T (x - z 0 ) < ^/x||x - z 0 || 2 . 



(z — z 0 ) T x < if, for any x t M. 



Then by the definition of a separating hyperplane, we have only to show that u < y T z. 
Since z is not in M, 

0 < ||z — z 0 || 2 = (z - z 0 ) T (z - Zo) 



T T T 

= yz — yz 0 = yz — if. 



□ 



The separating hyperplane defined above is a necessary tool in the elim- 
ination of duality caps in the finite linear optimization problem. 



'That such a unique vector exists is proven in [Ref. 9]. 



65 



4. The Strong Duality Theorem 

We close this section with a statement and proof of a fundamental 
theorem of linear optimization, which states sufficient conditions for the absence of a 
duality gap in the dual pair, (P, D). 



Theorem 15. [Ref. 8] Let the dual pair, (P, D) satisfy the following assum- 

tions. 

1. The dual problem is consistent and has a finite value V{D). 

2. The moment cone, M n + 1 is closed. 

Then (P) is consistent, and V(P) = V(D). That is, no duality gap occurs. 

Proof: Let let the vector, c = [co, Cj, . . . , c n ] T e M n+ i, be an optimal solution of 
the dual problem. Then, for any £ > 0, the vector, c' = [c 0 + £, Ci, . . . , c n ] T is 
not in M n +i. As we are assuming that M n+J is closed, we conclude that there is 
a hyperplane separating the vector c' from M„+i. Consequently, there exists some 
vector y = [yo,yi, • • • , y n ] T £ 7£ n+1 , with y / 0, such that 

n n 

X rVr < 0 < 2/o(co + e) + ^2 ^ y 
r=0 r=l 

for x = [x 0 , Xj, . . . , x n ] T c M n+ j. Let x = c. Then y 0 £ > 0, implying yo > 0. Now let, 

x = [b(s),ai(s), . . . ,a n (s)] T e A' C M n+ ]. 



where s t S, and a r (s) is the r th component of the constraint vector associated with 
s. Then 



^o r (s) 

r= 1 




> *(«), 
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which implies. 



y = 



-yi -y n 

y 0 yo 



c 7v n 



is feasible for the primal, P. Further, 



0 < yo{co -f s) + ^ c r y r . 



r = l 



Applying the duality lemma, we conclude that 



V(P) < E^y'r < c o + £ = y(D) + e < V(P)+e, 



r = l 



implying 



V(P) -£< V(D) < V(P), 



for any e. 



Of a final note, if the index set of our constraints is finite, then we may 
conclude immediately that no duality gap exists in the dual pair, (P, D). This follows 
directly from the above theorem in conjunction with Theorems 12 and 13. 



F. THE SIMPLEX ALGORITHM 

We present a very brief introduction to the Simplex algorithm, and use it to 
solve a simple LOP. This section is not intended to illustrate the implementation 
of the algorithm in any specific form. Rather, this section attempts to explain the 
algorithm as it exploits the results of the duality concepts above. The problem is 
assumed to be infinite-dimensional in this presentation. 
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We begin with a problem, P, of the form: 



Minimize: Z!r=i c rJA- 

subject to: £"=i a r (s)y r > 6(s), for all s t S. 

Then we write the dual, D: 

Maximize: £?=i b(s t )xi 

subject to: 5Z" =1 a r(<s,)a:,' = tv, r = 1, 2, . . . , n 

S{ £ 5 , Xi > 0 . 

Choose some subset, a = {sj, s?, . . . , s n ) C S, and a vector x = [xj , X 2 , . . . , x n ] T 
that is feasible for the dual. The methods to arrive at an initial feasible vector, 
provided one exists, may be found in any Linear Program text. In particular, the 
reader is referred to [Ref. 7]. We derive a vector y from our choice of <7, which is 
associated with the primal problem. As a matter of convenience, we abbreviate this 
set of values, {<7, x,y). We require that the vectors, a(s;) be linearly independent. 
That we may always find a set of linearly independent vectors is assumed in this 
presentation. 

Forming our matrix A as before, we know that the linear independence of the 
vectors ensures that there is a unique vector, x satisfying: 

Ax = c, 

since we are feasible in the dual. Define the discretized primal to be the linear 
program that results in considering only the finite subset of the index set S. Let 
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A(si , s 2 , . . . , 5 n ) = [a(si), a(s 2 ), a s n )], with b(si , . . . , s n ) defined in the same man- 

ner. From the discretization, o\ we for a vector, y, that is feasible for tl lis- 
cretized primal, P. We note that one such vector, y solves the equation: 

A T (s 1 ,s 2 , . . . , s n )y = b(s 1 .s 2) . • • ,s n ). 



Then 

y = (A T ) -1 b. 



The set of values of a and the vector y that Is formed in the manner above is called 
a basic solution of the LOP. The steps of the algorithm, to this point are: 



1. Select a subset, a C S, such that the vectors, a(si), a(s 2 ), . . . , a(s n ) 
are linearly independent. 

2. Compute the unique non-negative solution to the equation, Ax - ~. 

3. Compute the solution to the system, A r y — b, 
for the discretized primal. 



Return to the problem of approximating the exponential with a quadratic poly- 
nomial the interval [0,3]. We have formulated the problem with the constraint 
vectors of the index sets, a(t + ), and a(t“), given by 





1 




-1 




t 




-t 


a(t + ) = 


t 2 


, and a(t ) = 


-t 2 




1 




1 



Additionally, the constraint scalars were defined to be b(t + ) = e‘, and b(t ) = — e‘, 
and the objective function vector was given by c = [0,0,0, 1] T . The problem is 
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minimize: c y 

subject to: a(t + ) T y > b(t + ) 

a(t _ ) T y > b(t~) for all t t T 
over all y c 'R 4 . 

Step One: Arbitrarily select a to be composed of the union of the sets <J\ = 
{0,2} C T~ and a 2 = {1,3} C T+. 

Step Two: Compute the solution of the system 

- 11-11 
0 1-23 
0 1-49 
11 11 

The solution of this system is given by 



”1 






" 




Xi 




0 




^2 




0 




2-3 




0 




£4 




1 



(V.15) 



X = 



13 3 1 

L8 8 8 8 



Step Three: Compute the solution of 
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0 1 
1 1 
-4 1 
9 1 
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— e° 




2/2 




e 1 




2/3 




— e 2 




k 




e 3 



(V.16) 
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Figure 13. A First Approximation of the Exponential Function. 



The vector y = [1.6342 —2.2946 2.7445 .6342] T is the unique solution of this system. 
That is, y is feasible for the discretized primal. The first approximation is given by 

p 2 (x) = 1.6342 - 2.2946x + 2. 7445x 2 . (V.17) 

The graph of the exponential versus the approximation is given in Figure 13. 

We here introduce a lemma that offers us a termination criteria for the algo- 
rithm. 



Theorem 16. The Complementary Slackness Theorem [Ref. 9] Let 
the set, {a, x,y} be as above. If the vector y is feasible for the non- discretized 
primal P , and the following holds: 

f ' a r (s,' )y r for r — 1 , 2, . . . , n. 

Then we may conclude, that if the vector, y, as determined in step 3, is feasible 
for the primal , P , we have found the optimal vector in our problem. 
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Figure 14. Absolute Error in the First Exponential Approximation. 

In the current approximation problem, we find that the current solution does 
not satisfy this criteria. We observe the graph of the absolute difference between the 
functions and find that the error exceeds .6342 over the latter portion of the interval. 
See Figure 14. 

The remainder of the algorithm is a sequence of exchange steps that replace 
existing elements of the set, cr, with elements that improve the value of the dual 
problem, Z), and consequently, improve the bound of V(P). The method of selecting 
new elements to the set, cr, may change with implementation, but it should be noted 
that exactly one element of the set cr is replaced at a given step, in any implementation. 
Recalling from our discussion of extreme points of our feasible set, that strategy 
ensures that the algorithm looks to adjacent extreme points for optimality. 

We conduct one such exchange. Note that the error is most severe at the 
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Figure 15. Absolute Error in the Second Exponential Approximation. 



point t = 3. Then it is logical to seek a better solution at that point. Then we let 
o’! = {0,3}. The new system of equations requiring a solution in step 3 is 



-10 0 1 
1111 
-1 -3 -9 1 
13 9 1 

The solution of the above system is given by 



“l 


~ 




- 




y i 




— e° 




y 2 




e 1 




y 3 




— e 3 




k 




e 3 



y = 



i n T 



(VMS) 



We find that the error is decreased. The absolute error is given in Figure 15. 

We note that the solution is not feasible for the entire interval, since there 
exist points where the error exceeds .5. Thus, we would look to adjacent extreme 
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point solutions and repeat the process until we arrive at a discretized solution that is 
feasible throughout the interval. 



74 



VI. RECONSTRUCTION FORMULATION 

AND SOLUTION 

A. OVERVIEW 

Having laid the complete foundation, we formulate the image reconstruction 
optimization problem. The first portion of this chapter addresses the conceptual 
aspects of the problem, while in the latter portion we use the Simplex algorithm to 
solve a simple reconstruction problem. We conclude the chapter with a brief discussion 
of the merits and drawbacks of a Linear Programming approach to the reconstructii 
problem. 

B. TARGET FUNCTIONS AND NORMS 

The problem we wish to solve is to find the density function, /, that produces 
the observed sampled Radon transform. As the problem is ill-posed, we must define 
some preference function by which to compare the quality of the infinitely many 
density functions that satisfy the above requirement. We do so by specifying some 
function, g , defined over f), which is assumed to represent the most likely density 
of the image. That is, of all density functions that produce the observed transform 
data, we seek that which is most like what we expect to find. How we determine the 
function g is not a matter of discussion here. We only assume that we know some 
such function. 
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The problem of how to determine the best solution becomes one of finding 
the density function that produces the observed transform that is “closest” to g in 
some sense. We choose the infinity norm, or max norm, to measure closeness. Let 
P be a polygonal partition of the compact set f) C 7^ 2 , consisting of the n polygons, 
S\ , ^ 2 , . . . , s n . Recall that the function is defined to be the characteristic function 

of polygon j in P. Imposing the restriction that the optimum density be constant over 
each polygon, the density takes on the form, 

/M = '52 a M uj )- 

j=i 

We seek a density, / defined over f l that minimizes the following: 

ll/H- 0 (w)||oo = max {max | -pH |}. (VI. 1) 

j=l,2,...,n t wts j J 

We also choose some £ > 0 and insist that 

fp — b < £, and 

/ > 0, 

where the vector inequality is componentwise. Recall that fp is defined to be the 
sampled transform of the density / for partition, P. The vector b is the observed 
sample Radon transform. The non-negativity constraint stems from the physical 
nature of the problem. That is, we do not accept solutions that attribute negative 
density to physical objects. 

Before continuing, let us consider the objective function of equation (VI. 1). 
Recall that our attention is fixed on density functions defined on H, a compact subset 
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of TZ 2 . Let us first fix our attention on some polygon, s.,, in the polygonal partition P. 
Let Mj denote the largest absolute difference between our target function, g and the 



scalar, Qj. that we associate with the polygon, Sj. That is, f(u>) = Qj. for all u? e s } . 
The term 

max I a, — qiu) I 

u n Sj ' J ' 1 

is well defined, as both functions are piecewise continuous over the compact set, s y 
The objective function is defined to be the largest of the M } values over all polygons. 
As the problem is not linear, we write an equivalent formulation: 



minimize: 



ubject to: ||/p - b||oo 


< i 






Qj k 


> g(u)rfj(u), 


for all u e Ll, 




-Qj + k 


> -g(u)xbj{u), 


for all u e fl, 




Qj 


> 0, for all j. 




(VI.2) 



Suppose the target function, g , is chosen to be continuous over fi, and further 
suppose that gp = b, where gp is the sample transform of the target density, g. 
If the method is to prove worthwhile, we expect that the test density function is 
optimal. That is, if the test and target densities are the same, we can expect to find 
an arbitrarily good approximation of the test density. We state the above formally. 



Theorem 17. Let g be a non-negative, continuous function defined on the 
set fL Additionally, let values for £ > 0 and £ > 0 be given. Then there exists 



some partition , P, <?/ rz polygons, and an associated function , / = CljxL'j, 

so that the optimum value of the linear optimization problem: 



minimize: 
subject to: 



k 



||/p - <7p|I°o 


< 


£ 


(VI.3) 


atj + k 


> 


g(u>)if>j, for all u t f), 


(VI-4) 


-ex, + k 


> 


—g(io)xfj, for all u t fl. 


(VI.5) 


ay 


> 


0 , for all j 


(VI- 6 ) 



is less than e. 



Proof: We show that we may find a feasible vector for any value of k , and con- 
sequently, for k < e. The proof depends on the continuity of the sample Radon 
transform. That is, let g be any continuous function defined on f2, and let £ > 0 be 
given. Then we wish to show that there exists some 5\ > 0 and some partition P$ 1 
such that the following property holds: 



11/ - slloo < 6\ =► II/p,, - gp h !|oo < £• 

Let h(u>) =| f(oj) — g(u) \ . Recall that for a fixed partition, a single integral over a 
strip, q , defining the sample transform takes on the form 

K = Y. (j s h (“) 7 , 

where 7 q (u) is the characteristic function of the q th strip. Let M denote the area of 
the largest polygon in our partition, choose our to be less than That is, let 
the functions / and g differ by no more than in the uniform norm sense. We have 
already proven that we may do so for some partition. Then we know for each element 
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of our sample data vector 



K = it (/ /l M • 
- Ir (/, it ^ )<Ll ) ' 



< E 

t=i 

< £. 



M n 



M , 



(VI.7) 



As VI.7 holds for each of the finite number of sample integrals, we may conclude that 

II fp t , ~ 9Pi , ||oo < £• 

Thus, if we can disregard constraints, VI. 4, VI. 5, and VI. 6, for any £ > 0, we 
may find a partition Pj, that ensures 



IIZ-SlIoo < 



so that 



ll/p., - gp tl lloo < e. 

This implies that for any value of £, constraint (VI. 3) is met. 

Temporarily disregarding the constraint \\fp — pp||oo < £, we have the less 
restrictive optimization problem: 

mininize: k 

subject to: ay + k > g(u)ipj(uj), for all u t Q, 

— Oj + k > — for all u> c f2, 

a } > 0, for all j. (VI. 8) 
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That the value of the above optimization problem may be made arbitrarily small 
is a direct result of the fact that we may represent any continuous function, g on 
D arbitrarily well by a function of the desired form, per Theorem 2. Denote the 
partition for which the value of k is less than e by Ps 2 . The term 5 2 is the largest 
largest difference between two points in the same polygon of P. That is, 

x,yeSjcPs 2 => \\x - < S 2 . 

Finally, choosing S to be the min{£i, <^ 2 }, constraint VI. 3 is met, as are con- 
straints VI. 4 through VI. 6 for k < e. Then the problem is feasible. As this is a finite 
dimensional problem, we employ Theorem 13 to ensure that a solution to the problem 
exists. Therefore, the optimal value of the optimization problem is less than e. □ 

From the above claim, we expect that if our partition is sufficiently fine, then 
we may reasonably expect to find an acceptable approximation to the solution of our 
optimization problem. 

C. PROBLEM STANDARDIZATION 

We wish to understand the above formulation as it relates to our definition of 
the general linear optimization problem. Before proceeding, it is vital to note that 
we are formulating the problem after we have generated a polygonal partition, P, of 
ft. Throughout this section, we assume that P contains the n polygons, Sj, . . . , s n . 
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1. Inner Product Constraints: Refining the Feasible 
Set 

Before considering the constraints themselves, recall that the polygonal 
partition, P, forms an n-dimensional basis for a subset of the space of functions from 
which we select our optimal function. We may, consequently, think of any density 
function / as a vector y c lZ n , where the j th component of y is the scalar value of the 
density on polygon, s : . For reasons that become clear shortly, we augment the vector 
of decision variables to be y = [qi, < 22 , . . . , a n , k] T e 7£ n+1 . 

We divide our constraints into three distinct classes: 

1) Strip based constraints, 

2) Polygon based constraints, and 

3) based constraints. 

First consider the strip based constraints. We require that the sample 
transform of the optimal objective density be within a specified tolerance of the 
observed sample transform. The constraints were identified in the previous section 
by the equation 

Wfp-bU < £. 

Eliminating the norm above results in the two constraints 

-fp > -b-e, (VI. 9) 

and 

fp > b-e. (VI. 10) 
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Let m denote the number of strips used to generate the partition. P. 
Let Q = { <71 , <72 7 - • • i Qm} be the set of such strips. Then for each q c Q, we require 
that the j th component of our constraint vector be determined by the following rule: 

aj (qi) = area (sj) max {7 q (u>) V’j(^)} • 

UJtil 

Of course, this convention is the same as that of Chapter III. The j th component of 
the vector, a(q;) is the area of the j th polygon if the polygon falls within strip q t , and 
zero otherwise. As before, we wish to consider the two constraints associated with 
each strip separately, and define Q~ and Q + to index constraints (VI. 9 ) and (VI.IO) 
respectively. 

The right hand side of each constraint is also determined as in Chapter 
III. That is, b(q — ) and b(q + ) as in equations (VI. 9 ) and (VI.IO), where 6 , is the data 
from strip q , of our sample transform. We append a zero to each strip based constraint 
vector, as each is independent of the value k. 

The polygon based constraints are found entirely in the requirement 
that our density function be non-negative. In the initial formulation, the requirement 
was written 



oij > 0 , for all j. 



That is, we require that the density assigned to each polygon in the optimal vector 
be non-negative. Let P = {s], S2, . . . , s n } be the fixed partition. Then the constraint 
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vector associated with each polygon is formed by the simple rule: 



for i,j = 1,2, .. . , n. 



aj(si) = max{ipj(uj) 0,(u;)} , 




0, otherwise, 



We append a zero to each a(sj) as in the case of the strip based con- 
straints, as we are selecting a vector from 7£ n+1 . Clearly, 6(s t ) = 0, for all i. Then the 
vector form of each polygon based constraint is: 



(a(si),y) > 0, for i = 1,2, . . . ,n. 

As our third class of constraints corresponds to the set fi, we may 
correctly infer that the final index sets are infinite. These index sets provide the 
constraints that facilitate a comparison of solution quality. There are, in fact, two 
such index sets, fl + and as we again eliminate the explicit use of the infinity 
norm from the formulation. In the initial problem statement, these constraints were 
written 



<*j + k > g{u) 

— ay + k > —g(u) tpj(u), for all u e fl. 



83 



We focus only on the former, as fi” is formulated in a nearly identical 
manner, and the process has been executed in the strip based constraints. We desire 
constraints of the form (a(u; + ),y) > b(u; + ), for all t £1 + . Let 



Oj(^ + ) = for j = 1,2, e ft + , 

a n+ i(w+) =1, for all u + t fl + . 

The associated 6(u> + ) is defined to be g(u + ). The constraints associated with the set 
f l~ are formed in exactly the same manner, with sign changes as appropriate. 

Concluding, we define our index set T = Q + U Q~ U P U Q + U f l~ . 

D. THE IMAGE RECONSTRUCTION DUAL 

The image reconstruction optimization problem, as we have formed it, is a 
specific example of the uniform approximation problem. Consequently, we find some 
strong similarities in its dual problem to the dual of the approximation of the ex- 
ponential function. Let us derive the dual, D , of our image reconstruction problem. 
Note that this section is included in the interest of completeness. The material herein 
is complicated and is not especially enlightening. The reader may wish to skip this 
section. 

We seek a subset, . . . ,t q ~} C T, and the non-negative vector x = 

[xi , x 2 , . . • , x q ] T that maximize the equation 

(b.x) 
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and satisfies 



(x, a(t)) = c. 

We address the selection of the subset first. Consider the strip-based con- 
straints, associated with Q C T. Recall that Q = Q + U Q~ . We seek some subset of 
each of these sets. We denote these subsets Q + , and Q~ . With each element of each 
subset, we associate a non-negative real number, x(q + ), for j = 1 , . . . , n 9 -+ , and x(qj), 
for j = 1 , . . . , n ? - . 

Considering the index set, P, with which we associated the polygon based 
constraints, we seek some non-negative value x(s } ) to associate with each constraint 
of a subset P C P. Following the above convention, we let j = 1, . . . ,np. 

Let us move to the infinite index sets, ft + and f 1~. As we noted above, there 
are two classes of constraint vectors associated with our index sets, fi + and In 
particular, 



1 

I 






i' 2 {u + ) 






: 


and 


; 


0n(w + ) 




1 

c 

1 


1 




1 



For each of our index sets, fi + and , we seek some discretization 
n + = {w+,w+,...,u;+ a+ }, and 

Cl~ = . . . ,u>~ }, as well as non-negative scalars, 
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x(w+),a-(^),...,x(u;+ f!+ ), and 
x (ui),x(ui ),..., a:(a;“ fl _). 



Then the dual D is to find the above discretization and non-negative x values 



that maximize the expression: 



"<?+ n q- n p n n+ "n- 

£ x(q t)Hqt) - £ *(%")&(%") + i,x(si)b(si) + £ x(uf)b{ut)- £ x(u-)b(u~), 



1=1 



1=1 



1=1 



1=1 



1=1 



while satisfying the constraints: 



n Q + n Q - n Cl- 

£ x ( ( l? ) a r{qt £ *(9r)«r(9, r )+£ a; («*) a r(«<)+£ ) a r(w?)-£ x(wr)a r (w,") 



t=l 



t=l 



1=1 



1=1 



i = l 



for r = 1,2, . . . , n, and 



n Q + n Q ~ 

£ X (9. + ) a n+l(9, + ) - £ *(<7,~) a n+l(<7r) 



1=1 



t=l 



n ^, n n + 

+ ( 5 *) d " ) a n+l ) 

1=1 1=1 



n n- 

- £ x(u-)a n+1 (u~) 



i = 1 



= 1 



— 0, for i = 1,2, ...,tiq + , 
*(9f) > 0, for * = 1,2, .. . ,tzq_, 
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x{s,) > 0, 


for 


* = 1.2,. 


. • , Tip, 


x(u+) > 0 , 


for 


*‘=1,2,.. 


■ ■ , 


x(u~) > 0, 


for 


i = 1,2,.. 


,.,n A _ 



While the above formulation of the dual is intimidating, we may simplify im- 
mediately by recognizing some features of the constraints of our primal problem. We 
know that the scalar, 6(s,) = 0, for all s,. Then the middle term in the objective 
function disappears completely. 

Let us move to the first constraint. The middle sum also collapses to the single 
term x(s r ), as we have defined a r (sj ) to be the Kronecker 5(r,j). The first three 
terms of the second constraint disappear altogether, as we have specified, a n+ i(sj) = 
a n+i (sj) — 0, for all j. The non-negativity constraints remain the same. 

E. A SAMPLE SOLUTION 

We now use the formulation of the image reconstruction problem as a linear 
program to solve a simple problem. We first discuss the geometry of the partition 
that we are using, and then identify some additional simplifying assumptions that 
make the problem more tractable. We introduce the expected density of our sample 
problem, and conclude with the Simplex solution of the problem. 

1. The Partition 

The partition that we use in this example is illustrated in Figure 16, 
where the color of a polygon is a function of its area. Larger values correspond to 
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Figure 16. The Partition of the Sample Problem 

lighter colors. We have chosen the four angles 0, |, and with five strips for each 

angle. The resulting partition consists of 89 polygons. 1 It should be noted that each 
strip has width of Consequently, as views at angles of ^ and ^ require more than 
5 strips to cover the unit square completely, only the portion of the square that falls 
in the five center strips is considered. The rest is ommitted. 

2. A Simplifying Assumption 

Rather than attempt to solve the infinite dimensional problem as de- 
rived in the initial portion of this chapter, we project the target density onto the 
n-dimensional polygonal basis of our partition. That is, we insist that the target 

1 The manner in which the polygons were identified and the areas of each polygon computed is 
not a matter of particular concern here. It is sufficient to state that the symmetry of the partition 
was deeply exploited in a manner which simplifies the problem of polygon identification and area 
computation over 89 polygons to one of many fewer than 89. 
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function be constant on each of the polygons of the partition. This simplification 
reduces the infinite index sets fi + and ft" to finite sets, as we need only consider a 



representative Uj e Sj for each polygon s 3 when determining the norm of the differ- 
ence between our target function and optimal function. Without this assumption, the 
problem is very similar to the infinite problem of approximating the exponential with 
polynomials, which was discussed in more detail when the Simplex algorithm was 
introduced. It is possible that this problem is solveable without this simplification, 
but no attempt is made to solve it in this thesis. 

We choose, in projecting the target function onto our finite dimensional 
space, the density of the function over each polygon divided by the area of the polygon. 
That is, after the target function g is projected into the finite space, it takes on the 
form: 

g = 

j=i 

where 



Pi = 



4 9_dA \ 

4^4 



That is, /3 = the mass of g over polygon Sj, divided by the area of polygon s r 

3. The Target Function 

We now identify the target function of the sample problem. We use the 
simple function, 

g{x,y) = (x- ^ + (y-^) > 
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Figure 17. The Projected Target Function 



for x e [0,1] and y t [0,1]. The projection of the density function is illustrated in 
Figure 17. The particular data for the constant densities assigned to each of the 
strips represent the values which we hope, or expect, to find in the solution of our 
problem, before considering the data. That is, the values are assumed to represent 
the most likely solution to our problem. 

4. The Test Density 



The density function that we use to generate the test data is given by 



the expression 



h(x,y) = 



( x ~ if +{y~ if ’ for ( x ~ if < Te 



1 , 



otherwise. 



The density function is displayed in Figure 18. The values of sample transform 
defining integrals become the right hand side of our equality constraints when we 
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Figure 18. The Test Density Function 



formulate the problem. 

The manner in which we have defined the projection of a density onto 
the finite dimesional space assures us that both the test density h and its projection 
have the same sampled transform. Thus, barring catastrophic rounding error, the 
formulation is always feasible. That is, there must be some density function that 
produces the sampled transform, even after we have projected the test density onto 
the partition. If the sampled transform is uniquely determined, we reconstruct the 
projection perfectly, though the value of the variable k may be quite large. 

As a basis of comparison, we note that the maximum difference between 
the projection of the target density and the test density is given by d = .1949. We 
may certainly expect then, that the optimal density varies by no more than the above 
value of d. 
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Figure 19. The Simplex Solution of the Sample Problem 

The optimal density as determined by the Simplex algoritm is displayed 
in Figure 19. The value achieved for the maximum absolute deviation between the 
target density and the optimal density is d! = .1577. We consider the difference over 
each polygon in Figure 20. 

F. SUITABILITY OF SIMPLEX IN IMAGE RECON- 
STRUCTION 

We briefly consider the merit of using the Simplex algorithm to solve the image 
reconstruction problem. That is, we wish to consider how well the tool we have chosen 
fits our particular job. 

The results of this particular example show the tendency of this formulation 
to spread error over the entire region. This consequence, it is believed, results from 
the use of the infinity norm. We may also question the choice of target functions, 
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Figure 20. Difference over Each Polygon 

and may look to other methods of qualification. However, the optimization problem 
achieves what it is designed to achieve. That is, we have found the density that 
satisfies the minimum deviation in the uniform norm sense. 

We are also forced to consider the substantial data that are required to solve the 
problem. The problem of polygon identification is a difficult one by itself, especially 
in view of the fact that a typical partition for the CAT scan problem is generated by 
200-300 angles with up to 500 strips per angle. With this geometry, we know that the 
number of polygons exceeds 4,000,000. Further, we require the area of each polygon 
be known to solve the problem as we have formulated it. Finally, as each polygon 
gives rise to a variable in our primal problem, we are solving a problem in a subspace 
of 7Z n where n is quite large. On the positive side, we know that we must solve the 
polygon identification and polygonal area problems only once. Further, the matrix, 
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A which results from the formulation above is extremely sparse, which may lead to a 
more rapid solution of the Simplex problem, or invite other methods of solving Linear 
Programs. 

In conclusion, the author contends that the Simplex algorithm fits well con- 
ceptually, but may not suited for the vastness of the problem as it is formulated here. 
Projecting the density functions onto the polygonal partion is conceptually identical 
to selecting finite subsets of an infinite index set. The theorems presented in regard to 
the image reconstruction problem indicate that we may solve the infinite dimensional 
problem through a sequence of finite dimensional problems, when certain conditions 
are met. 

Some alternatives that might warrant future consideration are norms other 
than the infinity norm, or using the Simplex model to refine existing solutions to the 
reimaging problem, where the number of variables is less restrictive. 
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VII. 



CONCLUSION 



In conclusion, we have introduced the Simplex algorithm in a context quite 
apart from its usual applications. The principal vehicles for the exploration of the 
algorithm were three unique applications, each illuminating distinct features of the 
theory underlying the implementation of the Simplex algorithm. 

In particular, we began with a problem of finding orthogonal monic polynomi- 
als over a closed interval. This example led to a very basic Simplex formulation, and 
was solved as a finite dimensional problem. The requirement that the polynomials 
be monic facilitated the relatively simple formulation. Follow on problems to this 
example might be the adaptation of the algorithm to generate an non-polynomial 
orthogonal basis for an infinite dimensional function space, or perhaps to fit the al- 
gorithm to solving the non-linear orthonormal basis generation problem. 

Second, we formulated the problem of approximating a function over a closed 
interval in the uniform norm sense. Unlike the first example, the problem was infinite 
dimensional, in that the formuation required a constraint for every number in the 
uncountably infinite set. This problem proved particularly helpful in illustrating the 
principle of weak duality, and ultimately illustrated the Simplex algorithm itself. The 
special qualities of polynomial approximation were ommitted, though the reader is 
referred to [Ref. 9] for a more complete discussion thereof. Again, potential areas for 



95 



future research might include approximation with functions other than polynomials. 

Each of the above examples were used extensively to illustrate the highlights of 
convexity and duality, upon which the Simplex algorithm is based. The treatment was 
relatively general, though many of the theorems required that the linear optimization 
problem be finite. Work is underway to identify classes of infinite dimensional prob- 
lems which may be solved by a sequence of finite dimensional problems. The reader 
is referred to [Ref. 11] for more complete discussion of this active area of research. 
Highlights include infinite horizon planning, fuzzy set semi-infinite programming, and 
linear programming in control theory. 

Another area of focus in this paper was on the Image Reconstruction problem. 
Again, this is an area of active research. After presenting the requisite background, 
we formulated this problem as an infinite dimensional linear optimization problem, 
and as a special case of the uniform approximation problem. Results were presented 
that indicated that use of Simplex to solve a sequence of linear programs is conceptu- 
ally sound, though not necessarily practical in view of the size of the problem. This 
application of the algorithm, however, is open to more extensive research in a number 
of areas. A different choice of norms by which the quality of density functions is 
measured may eliminate a number of constraints. A technique for formulating opti- 
mization problems with the 2-norm is found in [Ref. 12], and may prove useful in 
this application. The Simplex algorithm may also provide an inexpensive method to 
solve coarser problems, from which one may determine the necessity of constructing 
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more detailed models. Alternately, there may be some utility in using the algorithm 
to solve the reconstruction problem in only small portions of the set over which a 
density is defined. If there is utility in such an application, the logical consequence is 
research of parallel Simplex implementation. 

The potential utility of the Simplex algorithm to unconventional applications 
seems clear. Even when actual implementation of the algorithm is not practical, the 
tools of convexity and duality apply to broader areas of optimization. 
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